Autonomous agent learning using an actor-critic algorithm and behavior models

Author

  • Victor Uc Cetina
Abstract

We introduce a Supervised Reinforcement Learning (SRL) algorithm for autonomous learning problems where an agent is required to deal with high-dimensional spaces. In our learning algorithm, behavior models learned from a set of examples are used to dynamically reduce the set of relevant actions at each state of the environment encountered by the agent. Such subsets of actions are used to guide the agent through promising parts of the action space, avoiding the selection of useless actions. The algorithm handles continuous states and actions. Our experimental work with a difficult robot learning task shows clearly how this approach can significantly speed up the learning process and improve the final performance.
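The idea of using a behavior model to narrow the action set at each state can be illustrated with a minimal sketch. This is not the paper's implementation: the nearest-neighbor behavior model, the one-dimensional action, the clipping step, and every name and parameter below (`relevant_action_range`, `select_action`, `k`, `margin`, `sigma`, the example data) are assumptions chosen only to make the idea concrete.

```python
import math
import random

def relevant_action_range(state, examples, k=3, margin=0.1):
    """Hypothetical behavior model: bound the action by what was demonstrated
    in the k example states closest to the current state."""
    nearest = sorted(examples, key=lambda ex: math.dist(ex[0], state))[:k]
    actions = [action for _, action in nearest]
    return min(actions) - margin, max(actions) + margin

def select_action(state, actor_mean, examples, sigma=0.5, rng=random):
    """Sample from an assumed Gaussian actor policy, then clip the sample
    to the range the behavior model considers relevant for this state."""
    low, high = relevant_action_range(state, examples)
    proposal = rng.gauss(actor_mean(state), sigma)
    return max(low, min(high, proposal))

# Made-up demonstrations: (continuous state, continuous action) pairs.
examples = [
    ((0.0, 0.0), -1.0),
    ((0.1, 0.0), -0.8),
    ((0.2, 0.1), -0.9),
    ((1.0, 1.0), 0.9),
    ((0.9, 1.1), 1.0),
]

# An untrained actor that always proposes a mean action of 0.0 (assumption).
untrained_actor = lambda state: 0.0

state = (0.05, 0.0)
low, high = relevant_action_range(state, examples)
action = select_action(state, untrained_actor, examples, rng=random.Random(0))
print(f"relevant range: [{low:.2f}, {high:.2f}], chosen action: {action:.2f}")
```

In this sketch the actor still proposes actions, but exploration is confined to the part of the action space that the examples suggest is useful near the current state. A critic would then evaluate the clipped action as in any actor-critic update.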


Similar resources

An Analysis of Actor/Critic Algorithms Using Eligibility Traces: Reinforcement Learning with Imperfect Value Function

We present an analysis of actor/critic algorithms, in which the actor updates its policy using eligibility traces of the policy parameters. Most of the theoretical results for eligibility traces have been for only critic's value iteration algorithms. This paper investigates what the actor's eligibility trace does. The results show that the algorithm is an extension of Williams' REINFORCE algori...


Shifting Attention Using a Temporal Difference Prediction Error and High-Dimensional Input

Research on reinforcement learning has increasingly focused on the role of neuromodulatory systems implicated in associative learning. Formulations of temporal difference (TD) learning have gained a great deal of attention due to the similarity of the TD prediction error and the observed activity of dopamine neurons in the primate midbrain. Recent work has attempted to integrate additional neur...


Actor-Critic Models of Reinforcement Learning in the Basal Ganglia: From Natural to Artificial Rats

Since 1995, numerous Actor-Critic architectures for reinforcement learning have been proposed as models of dopamine-like reinforcement learning mechanisms in the rat’s basal ganglia. However, these models were usually tested in different tasks, and it is then difficult to compare their efficiency for an autonomous animat. We present here the comparison of four architectures in an animat as it p...


Simultaneous Control and Human Feedback in the Training of a Robotic Agent with Actor-Critic Reinforcement Learning

This paper contributes a preliminary report on the advantages and disadvantages of incorporating simultaneous human control and feedback signals in the training of a reinforcement learning robotic agent. While robotic human-machine interfaces have become increasingly complex in both form and function, control remains challenging for users. This has resulted in an increasing gap between user con...


An Actor/Critic Algorithm that is Equivalent to Q-Learning

We prove the convergence of an actor/critic algorithm that is equivalent to Q-learning by construction. Its equivalence is achieved by encoding Q-values within the policy and value function of the actor and critic. The resultant actor/critic algorithm is novel in two ways: it updates the critic only when the most probable action is executed from any given state, and it rewards the actor using c...




Publication date: 2008